Genetics and population analysis QuASAR: quantitative allele-specific analysis of reads
Authors
Abstract
Motivation: Expression quantitative trait loci (eQTL) studies have discovered thousands of genetic variants that regulate gene expression, enabling a better understanding of the functional role of non-coding sequences. However, eQTL studies are costly, requiring large sample sizes and genome-wide genotyping of each sample. In contrast, analysis of allele-specific expression (ASE) is becoming a popular approach to detect the effect of genetic variation on gene expression, even within a single individual. This is typically achieved by counting the number of RNA-seq reads matching each allele at heterozygous sites and testing the null hypothesis of a 1:1 allelic ratio. In principle, when genotype information is not readily available, it could be inferred directly from the RNA-seq reads. However, there are currently no methods that jointly infer genotypes and conduct ASE inference while accounting for uncertainty in the genotype calls.
Results: We present QuASAR, quantitative allele-specific analysis of reads, a novel statistical learning method for jointly detecting heterozygous genotypes and inferring ASE. The proposed ASE inference step takes into account the uncertainty in the genotype calls and includes parameters that model base-call errors in sequencing and allelic over-dispersion. We validated our method with experimental data for which high-quality genotypes are available. Results for an additional dataset with multiple replicates at different sequencing depths demonstrate that QuASAR is a powerful tool for ASE analysis when genotypes are not available.
Availability and implementation: http://github.com/piquelab/QuASAR.
Contact: [email protected] or [email protected]
Supplementary information: Supplementary Material is available at Bioinformatics online.
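The conventional per-site test mentioned in the abstract (counting reads for each allele at a heterozygous site and testing a 1:1 allelic ratio) can be illustrated with a plain two-sided binomial test. This is only a minimal sketch of that baseline idea, not QuASAR's model: it assumes the site is truly heterozygous and ignores base-call errors, over-dispersion and genotype uncertainty, which are exactly the factors QuASAR is described as modelling. The function name and parameters are hypothetical.

```python
from math import exp, lgamma, log


def allelic_ratio_pvalue(ref_reads: int, alt_reads: int, p_null: float = 0.5) -> float:
    """Two-sided binomial p-value for the null of a fixed allelic ratio.

    Illustrative baseline only (hypothetical helper, not part of QuASAR).
    p_null must lie strictly between 0 and 1.
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no reads, no evidence against the null

    def log_pmf(k: int) -> float:
        # log of the Binomial(n, p_null) probability of k reference reads,
        # computed in log space so deep coverage does not overflow
        return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                + k * log(p_null) + (n - k) * log(1.0 - p_null))

    observed = log_pmf(ref_reads)
    # Sum the probability of every outcome at least as unlikely as the
    # observed one (small tolerance so exact ties are included).
    total = sum(exp(lp) for lp in map(log_pmf, range(n + 1))
                if lp <= observed + 1e-9)
    return min(1.0, total)


if __name__ == "__main__":
    # 8 reference reads versus 2 alternate reads at one site
    print(allelic_ratio_pvalue(8, 2))
```

A small p-value suggests allelic imbalance at that site, but with this simple test a homozygous site miscalled as heterozygous would also look strongly imbalanced, which is why the abstract stresses handling genotype uncertainty.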
Similar resources
The Comparative Analysis of the Allele Pool of Thoroughbred Horses in Different Countries
The aim of the present study was to conduct a comparative analysis of the allele pool of the Ukrainian population of Thoroughbred horses and of populations from England, the USA, Russia and South Korea using microsatellite DNA loci, based on our own research and literature sources. Comparative analysis of the allele pool of Thoroughbred populations from different countries was conducted using 6 ...
QuASAR-MPRA: accurate allele-specific analysis for massively parallel reporter assays
Motivation The majority of the human genome is composed of non-coding regions containing regulatory elements such as enhancers, which are crucial for controlling gene expression. Many variants associated with complex traits are in these regions, and may disrupt gene regulatory sequences. Consequently, it is important to not only identify true enhancers but also to test if a variant within an en...
UGT1A1 gene linkage analysis: application of polymorphic markers rs4148326/rs4124874 in the Iranian population
Objective(s): Mutations in the UGT1A1 gene are responsible for hyperbilirubinemia syndromes including Crigler-Najjar type 1 and 2 and Gilbert syndrome. In view of the genetic heterogeneity and the large number of disease-causing mutations involved, the application of polymorphic markers in the UGT1A1 gene could be useful in molecular diagnosis of the disease. Materials and Methods: In the...
Heterozygosis deficit of polymorphic markers linked to the β-globin gene cluster region in the Iranian population
Objective(s): Iran is considered one of the high-prevalence areas for β-thalassemia, with a carrier frequency of about 10%. Molecular diagnosis of the disease is performed both directly, by sequencing, and indirectly, by the use of polymorphic markers present in the β-globin gene cluster. However, to date there is no reliable information on the application of the markers in the Iranian pop...
Publication date: 2015